Formulating Camera-Adaptive Color Constancy as a Few-shot Meta-Learning Problem
Digital camera pipelines employ color constancy methods to estimate an
unknown scene illuminant, in order to re-illuminate images as if they were
acquired under an achromatic light source. Fully-supervised learning approaches
exhibit state-of-the-art estimation accuracy with camera-specific labelled
training imagery. The resulting models, however, typically suffer from domain gaps and fail
to generalise across imaging devices. In this work, we propose a new approach
that affords fast adaptation to previously unseen cameras, and robustness to
changes in capture device by leveraging annotated samples across different
cameras and datasets. We present a general approach that utilizes the concept
of color temperature to frame color constancy as a set of distinct, homogeneous
few-shot regression tasks, each associated with an intuitive physical meaning.
We integrate this novel formulation within a meta-learning framework, enabling
fast generalisation to previously unseen cameras using only a handful of
camera-specific training samples. Consequently, the time spent on data
collection and annotation is substantially reduced in practice whenever a new
sensor is used.
To quantify this gain, we evaluate our pipeline on three publicly available
datasets comprising 12 different cameras and diverse scene content. Our
approach delivers competitive results both qualitatively and quantitatively
while requiring a small fraction of the camera-specific samples compared to
standard approaches.
Comment: First two authors contributed equally.
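To make the task framing concrete, here is a minimal sketch of how illuminant
estimation could be split into homogeneous few-shot regression episodes by
binning samples on correlated color temperature. All names, shapes, and
constants below are illustrative assumptions, not the paper's actual interface.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: per-image features and illuminant labels (RGB), with an assumed
# precomputed correlated color temperature (CCT) per sample.
n = 1000
features = rng.normal(size=(n, 64))          # e.g. pooled image statistics
illuminants = rng.uniform(0.2, 1.0, (n, 3))  # ground-truth illuminant RGB
cct = rng.uniform(2500, 7500, n)             # assumed CCT labels (Kelvin)

# Partition the label space into homogeneous color-temperature bins; each bin
# defines one few-shot regression task, as suggested by the abstract.
bins = np.linspace(2500, 7500, 6)            # 5 tasks
task_of = np.digitize(cct, bins[1:-1])

def sample_episode(task_id, k_support=5, k_query=15):
    """Sample one few-shot regression episode from a single CCT bin."""
    idx = rng.permutation(np.flatnonzero(task_of == task_id))
    s, q = idx[:k_support], idx[k_support:k_support + k_query]
    return (features[s], illuminants[s]), (features[q], illuminants[q])

(support_x, support_y), (query_x, query_y) = sample_episode(task_id=2)
print(support_x.shape, query_x.shape)  # (5, 64) (15, 64)
```

A meta-learner would adapt on each episode's support set and be evaluated on
its query set; the episode construction above is the part specific to this
color-temperature framing.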
Learning to Sample the Most Useful Training Patches from Images
Some image restoration tasks like demosaicing require difficult training
samples to learn effective models. Existing methods attempt to address this
training-data problem by manually collecting a new training dataset that
contains enough hard samples; however, hard and easy regions coexist even
within a single image. In this paper, we present a data-driven approach
called PatchNet that learns to select the most useful patches from an image to
construct a new training set instead of manual or random selection. We show
that our simple idea automatically selects informative samples from a
large-scale dataset, leading to a surprising 2.35dB generalisation gain in
terms of PSNR. In addition to its remarkable effectiveness, PatchNet is also
resource-friendly as it is applied only during training and therefore incurs
no additional computational cost during inference.
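A minimal sketch of learned patch selection in this spirit: a tiny scorer
ranks candidate patches and only the top-k are kept for training the
restoration model. The scorer architecture, patch size, and top-k rule are
illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class PatchScorer(nn.Module):
    """Tiny CNN that predicts a usefulness score per patch (assumed design)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 1),
        )

    def forward(self, patches):          # (N, 3, H, W) -> (N,) scores
        return self.net(patches).squeeze(1)

def select_patches(image, scorer, patch=64, k=8):
    """Tile an image into patches and keep the k highest-scoring ones."""
    c, h, w = image.shape
    tiles = (image.unfold(1, patch, patch).unfold(2, patch, patch)
                  .reshape(c, -1, patch, patch).transpose(0, 1))  # (N, 3, p, p)
    with torch.no_grad():
        scores = scorer(tiles)
    return tiles[scores.topk(k).indices]

scorer = PatchScorer()
useful = select_patches(torch.rand(3, 256, 256), scorer)
print(useful.shape)  # torch.Size([8, 3, 64, 64])
```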
SteReFo: Efficient Image Refocusing with Stereo Vision
Whether to attract viewer attention to a particular object, give the
impression of depth or simply reproduce human-like scene perception, shallow
depth of field images are used extensively by professional and amateur
photographers alike. To this end, high quality optical systems are used in DSLR
cameras to focus on a specific depth plane while producing visually pleasing
bokeh. We propose a physically motivated pipeline to mimic this effect from
all-in-focus stereo images, typically captured by mobile cameras. It can
change the focal plane a posteriori at 76 FPS on KITTI images, enabling
real-time applications. As our portmanteau name suggests, SteReFo
interrelates stereo-based depth estimation and refocusing efficiently. In
contrast to other approaches, our pipeline is simultaneously fully
differentiable, physically motivated, and agnostic to scene content. It also
enables computational video focus tracking for moving objects in addition to
refocusing of static images. We evaluate our approach on the publicly available
datasets SceneFlow, KITTI, and CityScapes, and quantify the quality of
architectural changes.
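To illustrate the physically motivated idea, the sketch below refocuses an
all-in-focus image using a disparity map: blur strength grows with distance
from the chosen focal plane, and the result is composited from a few
discretized depth layers. The constants, layer count, and the Gaussian blur
standing in for a bokeh kernel are all illustrative assumptions; the actual
pipeline is differentiable and considerably more elaborate.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def refocus(image, disparity, focus_disp, max_sigma=6.0, n_layers=8):
    """Layered depth-of-field rendering from an image + disparity map."""
    h, w, _ = image.shape
    out = np.zeros_like(image)
    weight = np.zeros((h, w, 1))
    edges = np.linspace(disparity.min(), disparity.max(), n_layers + 1)
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = ((disparity >= lo) & (disparity <= hi)).astype(float)[..., None]
        mid = 0.5 * (lo + hi)
        # Blur proportional to distance from the focal plane (a proxy for
        # the circle-of-confusion diameter).
        sigma = max_sigma * abs(mid - focus_disp) / (edges[-1] - edges[0] + 1e-8)
        out += gaussian_filter(image * mask, sigma=(sigma, sigma, 0))
        weight += gaussian_filter(mask, sigma=(sigma, sigma, 0))
    return out / np.maximum(weight, 1e-8)

img = np.random.rand(120, 160, 3)
disp = np.tile(np.linspace(0, 1, 160), (120, 1))   # toy left-to-right ramp
shallow = refocus(img, disp, focus_disp=0.5)
print(shallow.shape)  # (120, 160, 3)
```

Changing `focus_disp` re-renders the focal plane a posteriori, which is the
operation the pipeline performs in real time.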
AIM 2019 Challenge on Image Demoireing: Dataset and Study
This paper introduces a novel dataset, called LCDMoire, which was created for
the first-ever image demoireing challenge that was part of the Advances in
Image Manipulation (AIM) workshop, held in conjunction with ICCV 2019. The
dataset comprises 10,200 synthetically generated image pairs (each consisting
of an image degraded by moire and a clean ground-truth image). In addition to
describing the dataset and its creation, this paper also reviews the challenge
tracks, competition, and results, the latter summarizing the current
state-of-the-art on this dataset.
Image Demoireing with Learnable Bandpass Filters
Image demoireing is a multi-faceted image restoration task involving both
texture and color restoration. In this paper, we propose a novel multiscale
bandpass convolutional neural network (MBCNN) to address this problem. As an
end-to-end solution, MBCNN addresses the two sub-problems separately. For
texture restoration, we propose a learnable bandpass filter (LBF) to learn the
frequency prior for moire texture removal. For color restoration, we propose a
two-step tone mapping strategy, which first applies a global tone mapping to
correct for a global color shift, and then performs local fine tuning of the
color per pixel. Through an ablation study, we demonstrate the effectiveness of
the different components of MBCNN. Experimental results on two public datasets
show that our method outperforms state-of-the-art methods by a large margin
(more than 2dB in terms of PSNR).
Comment: Accepted by CVPR2020. Code is available at
https://github.com/zhenngbolun/Learnbale_Bandpass_Filte
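As a rough illustration of the two-step tone mapping strategy, the sketch
below applies a global per-channel adjustment followed by a small network that
predicts a per-pixel residual correction. Both modules are illustrative
assumptions and far simpler than MBCNN's actual components.

```python
import torch
import torch.nn as nn

class TwoStepToneMapping(nn.Module):
    def __init__(self):
        super().__init__()
        # Step 1 (global): per-channel affine tone adjustment (gain and bias),
        # a minimal stand-in for a learned global tone map.
        self.gain = nn.Parameter(torch.ones(1, 3, 1, 1))
        self.bias = nn.Parameter(torch.zeros(1, 3, 1, 1))
        # Step 2 (local): per-pixel residual color correction.
        self.local = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 3, 3, padding=1),
        )

    def forward(self, x):
        g = self.gain * x + self.bias     # correct the global color shift
        return g + self.local(g)          # fine-tune the color per pixel

model = TwoStepToneMapping()
restored = model(torch.rand(1, 3, 128, 128))
print(restored.shape)  # torch.Size([1, 3, 128, 128])
```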
A continual learning survey: Defying forgetting in classification tasks
Artificial neural networks thrive in solving the classification problem for a
particular rigid task, acquiring knowledge through generalized learning
behaviour from a distinct training phase. The resulting network resembles a
static entity of knowledge; attempts to extend this knowledge without
targeting the original task result in catastrophic forgetting. Continual
learning shifts this paradigm towards networks that can continually accumulate
knowledge over different tasks without the need to retrain from scratch. We
focus on task incremental classification, where tasks arrive sequentially and
are delineated by clear boundaries. Our main contributions concern 1) a
taxonomy and extensive overview of the state-of-the-art, 2) a novel framework
to continually determine the stability-plasticity trade-off of the continual
learner, 3) a comprehensive experimental comparison of 11 state-of-the-art
continual learning methods and 4 baselines. We empirically scrutinize method
strengths and weaknesses on three benchmarks, considering Tiny ImageNet, the
large-scale unbalanced iNaturalist, and a sequence of recognition datasets. We
study the influence of model capacity, weight decay and dropout regularization,
and the order in which the tasks are presented, and qualitatively compare
methods in terms of required memory, computation time, and storage.
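The task-incremental setting is easy to reproduce in miniature: the sketch
below trains one model on two toy tasks in sequence and reports task-0
accuracy after each stage, making catastrophic forgetting directly visible.
The data, model, and training schedule are illustrative assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

def make_task(shift):
    """Toy binary classification task whose inputs are shifted by `shift`."""
    x = torch.randn(512, 16) + shift
    y = (x.sum(dim=1) > shift * 16).long()
    return x, y

def accuracy(model, x, y):
    return (model(x).argmax(dim=1) == y).float().mean().item()

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# Tasks arrive sequentially with clear boundaries; naive fine-tuning on
# task 1 typically degrades task-0 accuracy toward chance.
tasks = [make_task(0.0), make_task(3.0)]
for t, (x, y) in enumerate(tasks):
    for _ in range(200):
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
    print(f"after task {t}: task-0 acc = {accuracy(model, *tasks[0]):.2f}")
```

Continual learning methods of the kind surveyed here add mechanisms
(regularization, replay, parameter isolation) to keep that first accuracy
from collapsing.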
DeepLPF: Deep Local Parametric Filters for Image Enhancement
Digital artists often improve the aesthetic quality of digital photographs
through manual retouching. Beyond global adjustments, professional image
editing programs provide local adjustment tools operating on specific parts of
an image. Options include parametric (graduated, radial filters) and
unconstrained brush tools. These highly expressive tools enable a diverse set
of local image enhancements. However, their use can be time consuming, and
requires artistic capability. State-of-the-art automated image enhancement
approaches typically focus on learning pixel-level or global enhancements. The
former can be noisy and lack interpretability, while the latter can fail to
capture fine-grained adjustments. In this paper, we introduce a novel approach
to automatically enhance images using learned spatially local filters of three
different types (Elliptical Filter, Graduated Filter, Polynomial Filter). We
introduce a deep neural network, dubbed Deep Local Parametric Filters
(DeepLPF), which regresses the parameters of these spatially localized filters
that are then automatically applied to enhance the image. DeepLPF provides a
natural form of model regularization and enables interpretable, intuitive
adjustments that lead to visually pleasing results. We report on multiple
benchmarks and show that DeepLPF produces state-of-the-art performance on two
variants of the MIT-Adobe-5K dataset, often using a fraction of the parameters
required for competing methods.
Comment: Accepted for publication at CVPR 2020.
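To make one filter type concrete, the sketch below applies a graduated filter
whose strength ramps linearly across the image along a chosen direction. The
parametrization (angle, ramp offsets, gain) is an illustrative assumption;
DeepLPF regresses its filter parameters with a deep network rather than taking
them as constants.

```python
import numpy as np

def graduated_filter(image, angle_deg, start, end, gain):
    """Apply a linear-ramp exposure adjustment across the image."""
    h, w, _ = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    theta = np.deg2rad(angle_deg)
    # Position along the ramp direction, roughly normalized to [0, 1].
    proj = (xs * np.cos(theta) + ys * np.sin(theta)) / np.hypot(h, w)
    ramp = np.clip((proj - start) / max(end - start, 1e-8), 0.0, 1.0)
    # Blend between no adjustment (1.0) and the requested gain.
    scale = 1.0 + (gain - 1.0) * ramp[..., None]
    return np.clip(image * scale, 0.0, 1.0)

img = np.random.rand(128, 128, 3)
enhanced = graduated_filter(img, angle_deg=90, start=0.2, end=0.8, gain=1.4)
print(enhanced.shape)  # (128, 128, 3)
```

Because the adjustment is a smooth parametric function rather than a free-form
per-pixel map, it is interpretable and regularized by construction, which is
the property the abstract emphasizes.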
Low Light Video Enhancement using Synthetic Data Produced with an Intermediate Domain Mapping
Advances in low-light video RAW-to-RGB translation are opening up the
possibility of fast low-light imaging on commodity devices (e.g. smartphone
cameras) without the need for a tripod. However, it is challenging to collect
the required paired short-long exposure frames to learn a supervised mapping.
Current approaches require a specialised rig or the use of static videos with
no subject or object motion, resulting in datasets that are limited in size,
diversity, and motion. We address the data collection bottleneck for low-light
video RAW-to-RGB by proposing a data synthesis mechanism, dubbed SIDGAN, that
can generate abundant dynamic video training pairs. SIDGAN maps videos found
'in the wild' (e.g. internet videos) into a low-light (short, long exposure)
domain. By generating dynamic video data synthetically, we enable a recently
proposed state-of-the-art RAW-to-RGB model to attain higher image quality
(improved colour, reduced artifacts) and improved temporal consistency,
compared to the same model trained with only static real video data.
Comment: Accepted to ECCV 2020.
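SIDGAN learns its domain mapping with GANs; as a much simpler, non-GAN
illustration of the kind of paired data being synthesized, the sketch below
fabricates a (short, long) exposure pair from a single clean frame by exposure
scaling plus shot and read noise. Every constant here is an illustrative
assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

def synth_exposure_pair(clean_linear, exposure_ratio=16.0,
                        read_std=0.01, photon_scale=500.0):
    """clean_linear: linear-intensity frame in [0, 1]."""
    long_exp = clean_linear                              # well-exposed target
    short_signal = clean_linear / exposure_ratio         # underexposed input
    shot = rng.poisson(short_signal * photon_scale) / photon_scale
    short_exp = shot + rng.normal(0.0, read_std, clean_linear.shape)
    return short_exp.astype(np.float32), long_exp.astype(np.float32)

frame = rng.uniform(0.0, 1.0, (64, 64, 3))
short, long_ = synth_exposure_pair(frame)
print(short.mean(), long_.mean())  # short is roughly 16x darker and noisier
```

Applying this frame-by-frame to internet videos would yield dynamic training
pairs; the GAN-based mapping in the paper additionally closes the domain gap
to real sensor data, which this naive sketch does not attempt.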
NODE: Extreme Low Light Raw Image Denoising using a Noise Decomposition Network
Denoising extreme low light images is a challenging task due to the high
noise level. When the illumination is low, digital cameras increase the ISO
(electronic gain) to amplify the brightness of captured data. However, this in
turn amplifies the noise, arising from read, shot, and defective pixel sources.
In the raw domain, read and shot noise are effectively modelled using Gaussian
and Poisson distributions respectively, whereas defective pixels can be modelled
with impulsive noise. In extreme low light imaging, noise removal becomes a
critical challenge to produce a high quality, detailed image with low noise. In
this paper, we propose a multi-task deep neural network called Noise
Decomposition (NODE) that explicitly and separately estimates defective pixel
noise, in conjunction with Gaussian and Poisson noise, to denoise an extreme
low light image. Our network is purposely designed to work with raw data, for
which the noise is more easily modeled before going through non-linear
transformations in the image signal processing (ISP) pipeline. Quantitative and
qualitative evaluations show the proposed method to be more effective at
denoising real raw images than state-of-the-art techniques.
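The raw-domain noise model named in the abstract is straightforward to
synthesize: Poisson shot noise, Gaussian read noise, and impulsive noise at
defective pixels. NODE estimates these components with a multi-task network;
the sketch below only generates them, and all constants are illustrative
assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_raw_noise(clean, photon_scale=100.0, read_std=0.02, defect_prob=1e-3):
    """clean: linear raw intensities in [0, 1]."""
    shot = rng.poisson(clean * photon_scale) / photon_scale   # Poisson (shot)
    noisy = shot + rng.normal(0.0, read_std, clean.shape)     # Gaussian (read)
    defects = rng.random(clean.shape) < defect_prob           # impulsive
    noisy[defects] = rng.choice([0.0, 1.0], size=defects.sum())  # dead/hot
    return np.clip(noisy, 0.0, 1.0)

raw = rng.uniform(0.0, 0.3, (64, 64))   # dark scene, pre-ISP linear data
noisy_raw = add_raw_noise(raw)
print(float(np.abs(noisy_raw - raw).mean()))
```

Working in the raw domain keeps these three components separable; after the
non-linear ISP stages they would no longer follow these simple distributions,
which is why the network is designed to operate on raw data.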
Pixel Adaptive Filtering Units
State-of-the-art methods for computer vision rely heavily on the translation
equivariance and spatial sharing properties of convolutional layers without
explicitly taking into consideration the input content. Modern techniques
employ sophisticated deep architectures to circumvent this issue. In
this work, we propose a Pixel Adaptive Filtering Unit (PAFU) which introduces a
differentiable kernel selection mechanism paired with a discrete, learnable and
decorrelated group of kernels to allow for content-based spatial adaptation.
First, we demonstrate the applicability of the technique in applications where
runtime is of importance. Next, we employ PAFU in deep neural networks as a
replacement of standard convolutional layers to enhance the original
architectures with spatially varying computations to achieve considerable
performance improvements. Finally, diverse and extensive experimentation
provides strong empirical evidence in favor of the proposed content-adaptive
processing scheme across different image processing and high-level computer
vision tasks.
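A minimal sketch of content-based kernel selection in this spirit: a bank of K
candidate kernels is applied everywhere, a lightweight branch predicts
per-pixel selection weights, and the outputs are mixed with a softmax as a
differentiable stand-in for discrete selection. The sizes and the selector
design are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PixelAdaptiveFiltering(nn.Module):
    def __init__(self, channels=16, n_kernels=4):
        super().__init__()
        # Bank of candidate kernels: one 3x3 conv per slot.
        self.bank = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=1)
            for _ in range(n_kernels))
        # Per-pixel selection logits conditioned on the input content.
        self.selector = nn.Conv2d(channels, n_kernels, 3, padding=1)

    def forward(self, x):
        weights = F.softmax(self.selector(x), dim=1)          # (B, K, H, W)
        outs = torch.stack([k(x) for k in self.bank], dim=1)  # (B, K, C, H, W)
        return (weights.unsqueeze(2) * outs).sum(dim=1)       # (B, C, H, W)

layer = PixelAdaptiveFiltering()
y = layer(torch.rand(2, 16, 32, 32))
print(y.shape)  # torch.Size([2, 16, 32, 32])
```

Dropped in place of a standard convolution, such a unit lets the effective
kernel vary with image content while remaining end-to-end trainable.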